Web Domains with no Legal Deposit

   Critically Endangered

This entry concerns the preservation of websites and domains that fall outside the remit of legal deposit, or for which no legal deposit mandate exists. Web archiving can capture large quantities of material with routine, standards-based tools, but website capture and republication raise significant intellectual property rights issues. In many jurisdictions, but by no means all, those obstacles are overcome by regulations that enable a national library or other ‘legal deposit’ agency to copy and preserve content. Where no such permission exists, there is a significant risk of loss.

Digital Species: Web

Trend in 2023:

No change

Consensus Decision

Added to List: 2019

Trend in 2024:

No change

Previously: Critically Endangered

Imminence of Action

Immediate action necessary. Where detected should be stabilized and reported as a matter of urgency.

Significance of Loss

The loss of tools, data or services within this group would impact on many people and sectors.

Effort to Preserve | Inevitability

Loss seems inevitable: loss has already occurred or is expected to occur before tools or techniques develop.

Examples

Domains registered without a country code; domains with a country code but weak or unenforceable legal deposit permission to harvest.

‘Practically Extinct’ in the Presence of Aggravating Conditions

Uncertainty over IPR or the presence of orphaned works; lack of legal deposit mandate or remit; rapid churn of websites; lack of access to Internet Archive harvest; contentious content; encryption; digital rights management; non-standard content management.

‘Endangered’ in the Presence of Good Practice

Permissive approach to Legal deposit; legislation to support and/or manage associated risks.

2023 Review

This entry was added in 2019. It is characterized by regulatory barriers rather than technical ones, though the pace of change in web technologies, as well as the growth of web content, means that significant technical challenges still exist. The 2019 Jury noted that local conditions were also a significant factor; for example, websites often also fall under public records legislation or are important elements of corporate records, and so important parts of the web are harvested even when there is no explicit legal deposit legislation. The 2019 Jury particularly recognized the work of the Internet Archive to capture and preserve content. They noted significant gaps in web archiving and that, in too many cases, regulation was the barrier. The 2021 Jury agreed with this description and classification but added that in some limited instances, pywb tools (as opposed to automated web crawlers like Heritrix) could effectively capture the look and feel of a platform interface, preserving legacy versions for users to interact with in the future. However, pywb tools are manual and, therefore, cannot address the scale of the issue. They also do not capture interfaces in a way that makes it possible to recreate them in the future; they only allow interaction with a defined set of web pages. Because of this growing issue of scale, the 2021 Trend was towards greater risk. The 2022 Taskforce agreed and noted no change to the trend.

The 2023 Council agreed with the Critically Endangered classification. They also noted an increase in the imminence and inevitability of loss, recognizing that while the need for major efforts to prevent or reduce losses continues, it is now much more likely that material has already been lost and will continue to be lost before tools or techniques have been developed. While the Council agreed the entry description should be updated to reflect these areas of discussion, overall risks remain on the same basis as before (‘No change’ to trend).

2024 Interim Review

These risks remain on the same basis as before, with no significant trend towards even greater or reduced risk (‘No change’ to trend).

Council members also added that the presence of a clear IPR framework for preservation is enabling, whether it is through legislation (like legal deposit) or licensing.

Additional Comments

There is not only a significant risk of loss to the content but also a risk of loss to access. Unless the Internet Archive is picking these up, or permission regimes are in place, early instances of the web are gone forever and will continue to be lost.

See also:

  • In the 2023 NDSA Web Archiving Survey Report, one of the major takeaways was that respondents identified ‘staff capacity’ as the biggest barrier to web archiving. Few organizations dedicate more than one full-time employee to web archiving, and very rarely is someone’s entire position dedicated to it. See: National Digital Stewardship Alliance (NDSA) (2023) Web Archiving Survey Results: An NDSA Report. October 2023. Available at: https://doi.org/10.17605/OSF.IO/N5MYR [accessed 11 September 2024]

